Zagreb
Explainable Fundus Image Curation and Lesion Detection in Diabetic Retinopathy
Diabetic Retinopathy (DR) affects individuals with long-term diabetes. Without early diagnosis, DR can lead to vision loss. Fundus photography captures the structure of the retina along with abnormalities indicative of the stage of the disease. Artificial Intelligence (AI) can support clinicians in identifying these lesions, reducing manual workload, but models require high-quality annotated datasets. Due to the complexity of retinal structures, errors in image acquisition and lesion interpretation of manual annotators can occur. We proposed a quality-control framework, ensuring only high-standard data is used for evaluation and AI training. First, an explainable feature-based classifier is used to filter inadequate images. The features are extracted both using image processing and contrastive learning. Then, the images are enhanced and put subject to annotation, using deep-learning-based assistance. Lastly, the agreement between annotators calculated using derived formulas determines the usability of the annotations.
- Europe > Romania > Nord-Vest Development Region > Cluj County > Cluj-Napoca (0.04)
- Europe > Croatia > Zagreb County > Zagreb (0.04)
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.92)
KNARsack: Teaching Neural Algorithmic Reasoners to Solve Pseudo-Polynomial Problems
Požgaj, Stjepan, Georgiev, Dobrik, Šilić, Marin, Veličković, Petar
Neural algorithmic reasoning (NAR) is a growing field that aims to embed algorithmic logic into neural networks by imitating classical algorithms. In this extended abstract, we detail our attempt to build a neural algorithmic reasoner that can solve Knapsack, a pseudo-polynomial problem bridging classical algorithms and combinatorial optimisation, but omitted in standard NAR benchmarks. Our neural algorithmic reasoner is designed to closely follow the two-phase pipeline for the Knapsack problem, which involves first constructing the dynamic programming table and then reconstructing the solution from it. The approach, which models intermediate states through dynamic programming supervision, achieves better generalization to larger problem instances than a direct-prediction baseline that attempts to select the optimal subset only from the problem inputs.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Russia (0.04)
- (3 more...)
Ground-Truth Subgraphs for Better Training and Evaluation of Knowledge Graph Augmented LLMs
Cattaneo, Alberto, Luschi, Carlo, Justus, Daniel
Retrieval of information from graph-structured knowledge bases represents a promising direction for improving the factuality of LLMs. While various solutions have been proposed, a comparison of methods is difficult due to the lack of challenging QA datasets with ground-truth targets for graph retrieval. We present SynthKGQA, an LLM-powered framework for generating high-quality Knowledge Graph Question Answering datasets from any Knowledge Graph, providing the full set of ground-truth facts in the KG to reason over questions. We show how, in addition to enabling more informative benchmarking of KG retrievers, the data produced with SynthKGQA also allows us to train better models.We apply SynthKGQA to Wikidata to generate GTSQA, a new dataset designed to test zero-shot generalization abilities of KG retrievers with respect to unseen graph structures and relation types, and benchmark popular solutions for KG-augmented LLMs on it.
- Leisure & Entertainment > Games > Computer Games (0.68)
- Leisure & Entertainment > Sports > Soccer (0.68)
- Media > Film (0.68)
- Government (0.67)
VeriLoRA: Fine-Tuning Large Language Models with Verifiable Security via Zero-Knowledge Proofs
Liao, Guofu, Wang, Taotao, Zhang, Shengli, Zhang, Jiqun, Long, Shi, Tao, Dacheng
Fine-tuning large language models (LLMs) is crucial for adapting them to specific tasks, yet it remains computationally demanding and raises concerns about correctness and privacy, particularly in untrusted environments. Although parameter-efficient methods like Low-Rank Adaptation (LoRA) significantly reduce resource requirements, ensuring the security and verifiability of fine-tuning under zero-knowledge constraints remains an unresolved challenge. To address this, we introduce VeriLoRA, the first framework to integrate LoRA fine-tuning with zero-knowledge proofs (ZKPs), achieving provable security and correctness. VeriLoRA employs advanced cryptographic techniques -- such as lookup arguments, sumcheck protocols, and polynomial commitments -- to verify both arithmetic and non-arithmetic operations in Transformer-based architectures. The framework provides end-to-end verifiability for forward propagation, backward propagation, and parameter updates during LoRA fine-tuning, while safeguarding the privacy of model parameters and training data. Leveraging GPU-based implementations, VeriLoRA demonstrates practicality and efficiency through experimental validation on open-source LLMs like LLaMA, scaling up to 13 billion parameters. By combining parameter-efficient fine-tuning with ZKPs, VeriLoRA bridges a critical gap, enabling secure and trustworthy deployment of LLMs in sensitive or untrusted environments.
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- (5 more...)
A transfer learning approach for automatic conflicts detection in software requirement sentence pairs based on dual encoders
Wang, Yizheng, Jiang, Tao, Bai, Jinyan, Zou, Zhengbin, Xue, Tiancheng, Zhang, Nan, Luan, Jie
Software Requirement Document (RD) typically contain tens of thousands of individual requirements, and ensuring consistency among these requirements is critical for the success of software engineering projects. Automated detection methods can significantly enhance efficiency and reduce costs; however, existing approaches still face several challenges, including low detection accuracy on imbalanced data, limited semantic extraction due to the use of a single encoder, and suboptimal performance in cross-domain transfer learning. To address these issues, this paper proposes a Transferable Software Requirement Conflict Detection Framework based on SBERT and SimCSE, termed TSRCDF-SS. First, the framework employs two independent encoders, Sentence-BERT (SBERT) and Simple Contrastive Sentence Embedding (SimCSE), to generate sentence embeddings for requirement pairs, followed by a six-element concatenation strategy. Furthermore, the classifier is enhanced by a two-layer fully connected feedforward neural network (FFNN) with a hybrid loss optimization strategy that integrates a variant of Focal Loss, domain-specific constraints, and a confidence-based penalty term. Finally, the framework synergistically integrates sequential and cross-domain transfer learning. Experimental results demonstrate that the proposed framework achieves a 10.4% improvement in both macro-F1 and weighted-F1 scores in in-domain settings, and an 11.4% increase in macro-F1 in cross-domain scenarios.
- Europe > Germany (0.04)
- Europe > Croatia > Zagreb County > Zagreb (0.04)
- Europe > Croatia > Dubrovnik-Neretva County > Dubrovnik (0.04)
- (2 more...)
Findings of the BlackboxNLP 2025 Shared Task: Localizing Circuits and Causal Variables in Language Models
Arad, Dana, Belinkov, Yonatan, Chen, Hanjie, Kim, Najoung, Mohebbi, Hosein, Mueller, Aaron, Sarti, Gabriele, Tutek, Martin
Mechanistic interpretability (MI) seeks to uncover how language models (LMs) implement specific behaviors, yet measuring progress in MI remains challenging. The recently released Mechanistic Interpretability Benchmark (MIB; Mueller et al., 2025) provides a standardized framework for evaluating circuit and causal variable localization. Building on this foundation, the BlackboxNLP 2025 Shared Task extends MIB into a community-wide reproducible comparison of MI techniques. The shared task features two tracks: circuit localization, which assesses methods that identify causally influential components and interactions driving model behavior, and causal variable localization, which evaluates approaches that map activations into interpretable features. With three teams spanning eight different methods, participants achieved notable gains in circuit localization using ensemble and regularization strategies for circuit discovery. With one team spanning two methods, participants achieved significant gains in causal variable localization using low-dimensional and non-linear projections to featurize activation vectors. The MIB leaderboard remains open; we encourage continued work in this standard evaluation framework to measure progress in MI research going forward.
- Europe > Austria > Vienna (0.14)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (4 more...)
Edge-Based Predictive Data Reduction for Smart Agriculture: A Lightweight Approach to Efficient IoT Communication
Krekovic, Dora, Kusek, Mario, Zarko, Ivana Podnar, Le-Phuoc, Danh
The rapid growth of IoT devices has led to an enormous amount of sensor data that requires transmission to cloud servers for processing, resulting in excessive network congestion, increased latency and high energy consumption. This is particularly problematic in resource-constrained and remote environments where bandwidth is limited, and battery-dependent devices further emphasize the problem. Moreover, in domains such as agriculture, consecutive sensor readings often have minimal variation, making continuous data transmission inefficient and unnecessarily resource intensive. To overcome these challenges, we propose an analytical prediction algorithm designed for edge computing environments and validated through simulation. The proposed solution utilizes a predictive filter at the network edge that forecasts the next sensor data point and triggers data transmission only when the deviation from the predicted value exceeds a predefined tolerance. A complementary cloud-based model ensures data integrity and overall system consistency. This dual-model strategy effectively reduces communication overhead and demonstrates potential for improving energy efficiency by minimizing redundant transmissions. In addition to reducing communication load, our approach leverages both in situ and satellite observations from the same locations to enhance model robustness. It also supports cross-site generalization, enabling models trained in one region to be effectively deployed elsewhere without retraining. This makes our solution highly scalable, energy-aware, and well-suited for optimizing sensor data transmission in remote and bandwidth-constrained IoT environments.
- North America > United States (0.04)
- Europe > Croatia > Zagreb County > Zagreb (0.04)
- Europe > Germany > Berlin (0.04)
- Research Report (1.00)
- Overview (1.00)
- Information Technology (1.00)
- Food & Agriculture > Agriculture (1.00)
- Energy (1.00)
- Information Technology > Internet of Things (1.00)
- Information Technology > Data Science (1.00)
- Information Technology > Communications > Networks (1.00)
- (2 more...)
CRISP: Persistent Concept Unlearning via Sparse Autoencoders
Ashuach, Tomer, Arad, Dana, Mueller, Aaron, Tutek, Martin, Belinkov, Yonatan
As large language models (LLMs) are increasingly deployed in real-world applications, the need to selectively remove unwanted knowledge while preserving model utility has become paramount. Recent work has explored sparse autoencoders (SAEs) to perform precise interventions on monosemantic features. However, most SAE-based methods operate at inference time, which does not create persistent changes in the model's parameters. Such interventions can be bypassed or reversed by malicious actors with parameter access. We introduce CRISP, a parameter-efficient method for persistent concept unlearning using SAEs. CRISP automatically identifies salient SAE features across multiple layers and suppresses their activations. We experiment with two LLMs and show that our method outperforms prior approaches on safety-critical unlearning tasks from the WMDP benchmark, successfully removing harmful knowledge while preserving general and in-domain capabilities. Feature-level analysis reveals that CRISP achieves semantically coherent separation between target and benign concepts, allowing precise suppression of the target features.
- Europe > Austria > Vienna (0.14)
- North America > United States > Virginia (0.04)
- North America > Canada (0.04)
- (7 more...)
- Europe > Ukraine > Kyiv Oblast > Kyiv (0.14)
- Europe > Austria > Vienna (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- (98 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Education > Health & Safety > School Nutrition (0.93)
- Health & Medicine > Consumer Health (0.93)
A Communication-Latency-Aware Co-Simulation Platform for Safety and Comfort Evaluation of Cloud-Controlled ICVs
Zhao, Yongqi, Zhang, Xinrui, Mihalj, Tomislav, Schabauer, Martin, Putzer, Luis, Reichmann-Blaga, Erik, Boronyák, Ádám, Rövid, András, Soós, Gábor, Zhang, Peizhi, Xiong, Lu, Hu, Jia, Eichberger, Arno
Testing cloud-controlled intelligent connected vehicles (ICVs) requires simulation environments that faithfully emulate both vehicle behavior and realistic communication latencies. This paper proposes a latency-aware co-simulation platform integrating CarMaker and Vissim to evaluate safety and comfort under real-world vehicle-to-cloud (V2C) latency conditions. Two communication latency models, derived from empirical 5G measurements in China and Hungary, are incorporated and statistically modeled using Gamma distributions. A proactive conflict module (PCM) is proposed to dynamically control background vehicles and generate safety-critical scenarios. The platform is validated through experiments involving an exemplary system under test (SUT) across six testing conditions combining two PCM modes (enabled/disabled) and three latency conditions (none, China, Hungary). Safety and comfort are assessed using metrics including collision rate, distance headway, post-encroachment time, and the spectral characteristics of longitudinal acceleration. Results show that the PCM effectively increases driving environment criticality, while V2C latency primarily affects ride comfort. These findings confirm the platform's effectiveness in systematically evaluating cloud-controlled ICVs under diverse testing conditions.
- North America > United States (0.46)
- Europe > Austria > Styria > Graz (0.06)
- Europe > Hungary > Budapest > Budapest (0.05)
- (10 more...)
- Transportation > Ground > Road (1.00)
- Information Technology (1.00)
- Transportation > Infrastructure & Services (0.68)
- (2 more...)